Kaipeng Zhang

YoCausal: How Far is Video Generation from World Model? A Causality Perspective

May 28, 2026 (via arXiv)

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

May 21, 2026 (via arXiv)

WorldMark: A Unified Benchmark Suite for Interactive Video World Models

Apr 23, 2026 (via arXiv)

Generative World Renderer

Apr 02, 2026 (via arXiv)

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

Mar 26, 2026 (via arXiv)

WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG

Mar 24, 2026 (via arXiv)

PyVision-RL: Forging Open Agentic Vision Models via RL

Feb 24, 2026 (via arXiv)

OmniCustom: Sync Audio-Video Customization Via Joint Audio-Video Generation Model

Feb 12, 2026 (via arXiv)

Focal Guidance: Unlocking Controllability from Semantic-Weak Layers in Video Diffusion Models

Jan 12, 2026 (via arXiv)

ProSoftArena: Benchmarking Hierarchical Capabilities of Multimodal Agents in Professional Software Environments

Dec 30, 2025 (via arXiv)